Population class extension: HPC module
Module containing the functions that provide HPC functionality
These functions form a single API through which you can access HPC resources.
Generally, you should call an HPC function rather than the Slurm or Condor interface directly. The HPC function then decides which interface to use, so that the other modules can use a single API rather than having to choose between the Slurm and Condor APIs.
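For example, a caller might work purely through the HPC layer like this (a minimal sketch; the Population import path is an assumption based on the wider binarycpython package, while the HPC_* methods are the ones documented below):

    from binarycpython.utils.grid import Population  # assumed import path

    pop = Population()
    if pop.HPC_job():
        # The HPC layer decides whether this is a Slurm or a Condor job,
        # so the caller never touches the backend-specific interfaces.
        print("backend:", pop.HPC_job_type())  # "slurm", "condor" or "None"
        print("job id:", pop.HPC_jobID())      # e.g. "12345.0", or None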
This class object is an extension to the population grid object
- class binarycpython.utils.population_extensions.HPC.HPC(**kwargs)[source]
Extension to the population grid object that contains functionality to handle HPC (Slurm or Condor) jobs
- HPC_can_join(joinfiles, joiningfile, vb=False)[source]
Check the joinfiles to make sure they all exist and their .saved equivalents also exist
- HPC_check_requirements()[source]
Function to check that the HPC option requirements have been met. Returns a tuple: (True, "") if all is OK, (False, <warning string>) otherwise.
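For example (a sketch; pop is assumed to be a Population instance as in the module introduction):

    ok, warning = pop.HPC_check_requirements()
    if not ok:
        # The second element of the tuple says which HPC option is missing or invalid
        raise ValueError(warning)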
- HPC_get_status(job_id=None, job_index=None, hpc_dir=None)[source]
Get and return the appropriate HPC job (Condor or Slurm) status string for this job (or, if given, the job at id.index)
- Parameters
hpc_dir – optional HPC run directory. If not set, the default (e.g. slurm_dir or condor_dir) is used.
job_id – the id of the job to be queried
job_index – the index of the job to be queried
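For example (a sketch; the job id, index and directory below are illustrative values, and pop is a Population instance as above):

    # Status of the current job
    status = pop.HPC_get_status()

    # Status of a specific job in an explicit HPC run directory
    status = pop.HPC_get_status(job_id=12345, job_index=7, hpc_dir="/scratch/my_hpc_run")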
- HPC_grid(makejoiningfile=True)[source]
Function to call the appropriate HPC grid function (e.g. Slurm or Condor) and return what it returns.
- Parameters
makejoiningfile – if True, and we’re the first job with self.HPC_task() == 2, we build the joiningfile. (default=True) This option exists in case you don’t want to overwrite an existing joiningfile, or want to build it in another way (e.g. in the HPC scripts).
TODO: Exclude this function from testing for now
TODO: Comment this function better
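A typical call simply forwards to the backend grid function (a sketch; pop is a Population instance as above):

    # Dispatches to the Slurm or Condor grid function as appropriate.
    # Pass makejoiningfile=False if the joiningfile is built elsewhere,
    # e.g. by the HPC submission scripts.
    result = pop.HPC_grid(makejoiningfile=True)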
- HPC_id_filename()[source]
HPC jobs have a filename in their directory which specifies the job id. This function returns the contents of that file as a string, or None on failure.
- HPC_id_from_dir(hpc_dir)[source]
Function to return the ID of an HPC run given its (already existing) directory.
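For example (a sketch; the directory path is illustrative, and pop is a Population instance as above):

    # ID of an HPC run, given its existing run directory
    run_id = pop.HPC_id_from_dir("/scratch/my_hpc_run")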
- HPC_job()[source]
Function to return True if we’re running an HPC (Slurm or Condor) job, False otherwise.
- HPC_jobID()[source]
Function to return an HPC (Slurm or Condor) job id in the form of a string, x.y. Returns None if not an HPC job.
- HPC_job_task()[source]
Function to return the HPC task number, which is 1 when setting up and running the scripts, 2 when joining data.
- HPC_job_type()[source]
Function to return a string telling us the type of an HPC job, i.e. “slurm”, “condor” or “None”.
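These introspection helpers are typically combined to decide which stage of the HPC workflow the current process is in (a sketch; pop is a Population instance as above):

    if pop.HPC_job():
        task = pop.HPC_job_task()  # 1 while setting up/running, 2 while joining
        if task == 1:
            pop.HPC_grid()         # run stage: set up and run the grid scripts
        elif task == 2:
            joinfiles = pop.HPC_load_joinfiles_list()  # join stage (see below)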
- HPC_join_from_files(newobj, joinfiles)[source]
Merge the results from the list joinfiles into newobj.
- HPC_load_joinfiles_list(joinlist=None)[source]
Function to load in the list of files we should join, and return it.
If population_options[‘HPC_rebuild_joinlist’] is True, we rebuild it.
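Together with HPC_can_join() and HPC_join_from_files() above, a join step then looks roughly like this (a sketch; the joiningfile path is illustrative, and pop is a Population instance as above):

    joinfiles = pop.HPC_load_joinfiles_list()
    joiningfile = "/scratch/my_hpc_run/joiningfile"  # illustrative path

    if pop.HPC_can_join(joinfiles, joiningfile, vb=True):
        merged = Population()  # object that receives the merged results
        pop.HPC_join_from_files(merged, joinfiles)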
- HPC_make_joiningfile(hpc_jobid=None, hpc_dir=None, n=None, overwrite=False, error_on_overwrite=False)[source]
Function to make the joiningfile, which contains the filenames of the results from each job. When all of these files exist, we can join.
Note: you normally don’t need to set any of the option arguments.
- Parameters
hpc_jobid – the job ID number, or self.HPC_jobID_tuple()[0] if None (default=None).
hpc_dir – the HPC directory, or self.HPC_dir() if None (default=None).
n – the number of jobs, or self.HPC_njobs() if None (default=None).
overwrite – if True, overwrite an existing joiningfile (default=False)
error_on_overwrite – if True and we try to overwrite, issue an error and exit (default=False)
- Returns
True if the file is made, False otherwise.
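For example (a sketch; in normal use all arguments can be left at their defaults, and pop is a Population instance as above):

    # Write the joiningfile for the current run, refusing to clobber an existing one
    made = pop.HPC_make_joiningfile(overwrite=False, error_on_overwrite=True)
    if not made:
        print("joiningfile was not written")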
- HPC_restore()[source]
If self.population_options[hpc_job_type + '_restart_dir'] is provided, where hpc_job_type is "slurm" or "condor", set population_options['restore_from_snapshot_file'] so that we restore data from the existing HPC run; otherwise do nothing. This only works if population_options[hpc_job_type] == self.HPC_job_task() == 2, which is the run-grid stage of the process.
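A sketch of restoring from a previous Slurm run (the directory path is illustrative, and population_options is set here by direct dictionary access on a Population instance as above):

    # Point the population at a previous Slurm run directory
    pop.population_options["slurm_restart_dir"] = "/scratch/previous_hpc_run"

    # If we are in the run-grid stage (HPC_job_task() == 2), this sets
    # population_options["restore_from_snapshot_file"]; otherwise it does nothing.
    pop.HPC_restore()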